On the limitations of voice conversion techniques in emotion identification tasks
Abstract
The growing interest in emotional speech synthesis calls for the exploration of effective emotion conversion techniques. This paper estimates the relevance of three speech components (spectral envelope, residual excitation and prosody) for synthesizing identifiable emotional speech, so that voice conversion techniques can be customized to the specific characteristics of each emotion. The analysis is based on a listening test with a set of synthetic mixed-emotion utterances that draw their speech components from emotional and neutral recordings. The results show the importance of transforming the residual excitation for the identification of emotions that are not fully conveyed through prosodic means (such as cold anger or sadness in our Spanish corpus).
Similar resources
Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems
This article examines methods for optimizing the performance of GMM-based voice conversion systems, taking the GMM method as the baseline to be improved. Current methods use a single conversion function to convert all speech units, and the statistical averaging involved smooths the spectrum; as a result, we observe quality ...
Designing a New Non-Parallel Training Method for Voice Conversion with Better Performance than Parallel Training
Introduction: Voice mimicking by computer has been one of the most challenging topics in speech processing in recent years. A voice conversion system has two sides: on one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...
Effects of Textually-Enhanced Reading Tasks and Strategic Pre-Task Planning on Learning English Passive Voice
In the realm of second language acquisition (SLA), task-based language teaching (TBLT) and input enhancement (IE) have been the focus of a great number of studies. However, the effects of focus-on-form instruction through input-enhanced tasks, combined with pre-task planning time as one of the features of task-based language teaching, have rarely been explored in the field o...
Behavioral Analysis of Traffic Flow for an Effective Network Traffic Identification
Fast and accurate network traffic identification is becoming essential for network management, high quality of service control and early detection of network traffic abnormalities. Techniques based on statistical features of packet flows have recently become popular for network classification due to the limitations of traditional port and payload based methods. In this paper, we propose a metho...